ABSTRACT

This study was conducted to determine whether the 
Chapter 1 instruction provided to low-achieving students in the 
Oklahoma City Public Schools had a detectable effect on student 
academic growth. Reading and math achievement test scores of Chapter 
1 students in grades 2 to 8 were compared with those of a matched 
group of low-achieving non-Chapter 1 students over 2 years (1981-82
and 1982-83). Chapter 1 students were provided with 30-50 minutes per
day of extra remedial instruction in reading or math, and the
achievement of both groups was measured by the California Achievement 
Test (CAT). An Analysis of Covariance (ANCOVA) procedure was used to 
address aptitude-treatment interaction (ATI) questions of whether 
Chapter 1 treatment affected students differently, depending on their 
pretreatment achievement standing. Results, portrayed by a series of
graphs and tables, were consistent across nearly all grades: 23 out of 28
within-grade comparisons showed slope differences in the same
direction. These results indicate that initially lower achievers 
benefit more from Chapter 1 treatment than initially higher 
achievers. Implications of these findings for Chapter 1 selection 
criteria are discussed. (TE) 
Aptitude-Treatment Interactions
in Student Achievement: 
Implications for Program Policy Decisions 

Evaluators are faced with choices among many different methods 
of analyzing compensatory education (e.g., Chapter I) students'
academic growth. The most common method is the Model A (Tallmadge 
and Wood, 1976) "raw" NCE (Normal Curve Equivalent) gains analysis. 
Those results address the comparison of Chapter I students' 
progress in relation to a national norming sample. Work by Kimball 
and Crawford (1983) and others (Murray and Arter, 1980; Campbell 
and Stanley, 1966) has demonstrated the problems in using such 
"underspecified" evaluation models for inferential purposes. In
fact, the results from Model A analyses are best suited for 
descriptive purposes only (e.g., the mean NCE gain does tell us 
something about the performance level of the Chapter I students as
a whole, but statements about the effectiveness of the program or 
any "treatment effect" cannot be based on simple NCE gain 
analyses). For a more complete description of the problems and 
pitfalls concerning the use of Model A analyses, see Murray and 
Arter (1980). 

Our approach has been to attempt to base any inferences about
Chapter I treatment effects on between-group studies. It is true
that any quasi-experimental design employed will yield less-than-
perfect internal validity, but this is still preferable to the use
of treatment-only raw gains analyses for inferential purposes.

Of the three models adopted from the Tallmadge and Wood (1976) 
effort, the between-group model is most clearly represented as
Model B. Since random assignment of students into treatment and 
comparison groups is not possible, the design is, by necessity, 
quasi-experimental. Echternacht (1980) noted that the Model B 
design "is rarely used because its use requires withholding Title I 
services from students who might nominally be expected to receive 
some compensatory program" (p. 6). However, in our district, it is 
possible to create "matched" groups by using similar, low 
achieving, low SES (socioeconomic status) students who are not 
receiving Chapter I as the "status quo" or comparison group in
inferential analyses. The procedures employed in creating the
matched groups are described in the Method section.

It should be stated that although we do attend to regression
lines and utilize regression analyses, we are not following 
the procedures of Model C (the special regression model) which 
calculates Title I students' variation from a regression line 
calculated on higher scoring students in the comparison group. 

As part of our application of Model B procedures, an Analysis 
of Covariance Model was used. This was done for several reasons.
Even with the matching procedure on prescores and SES, which causes
the covariance-adjusted means to be identical to the unadjusted
(raw) means, the ANCOVA procedure gives more "precise" estimates of
effects (lower "Error" sums of squares). This is true whenever the
pre-score is correlated significantly with post-score. And,
perhaps more importantly, the analyses required prior to ANCOVA
address important substantive questions -- namely, the
"aptitude-treatment interaction" (ATI) questions. The homogeneity
of slopes test required prior to ANCOVA poses the question "Did the 
treatment affect students differently, depending on their pre- 
treatment achievement standing?" 

Even six years after the publication of the handbook of ATIs
(Cronbach and Snow, 1977), it is not universally recognized by
practicing evaluators that the classic "homogeneity of slopes" test
is also the test of ATIs. If the post-on-pre regression lines for
the groups being compared are not parallel, and if those lines
intersect, and if the point of intersection corresponds to a
reasonable pre-score (e.g., within the range of actual data), then there
would be evidence that the program primarily benefitted students
below or above the pre-score that corresponds to the point of
intersection. Such a result would have obvious implications for
inferences about program effects, as well as implications for
policy concerning student eligibility to receive the services.
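
The homogeneity of slopes test can be illustrated with a minimal sketch. It compares a model in which each group has its own post-on-pre regression line against the ANCOVA model of one pooled within-group slope, and returns the usual F ratio. The function names and the scores below are hypothetical and are not taken from the study; the sketch only assumes ordinary least squares within each group.

```python
from statistics import mean

def group_sums(pre, post):
    """Return (Sxx, Sxy, Syy), the sums of squares and products about the group means."""
    mx, my = mean(pre), mean(post)
    sxx = sum((x - mx) ** 2 for x in pre)
    sxy = sum((x - mx) * (y - my) for x, y in zip(pre, post))
    syy = sum((y - my) ** 2 for y in post)
    return sxx, sxy, syy

def homogeneity_of_slopes_f(groups):
    """F ratio for separate slopes versus one pooled slope.

    groups: list of (pre_scores, post_scores) pairs, one pair per group.
    Returns (F, df_numerator, df_denominator).
    """
    sums = [group_sums(pre, post) for pre, post in groups]
    n_total = sum(len(pre) for pre, _ in groups)
    k = len(groups)
    # Full model: every group has its own intercept and its own slope.
    sse_full = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in sums)
    # Reduced model (the ANCOVA assumption): own intercepts, one pooled slope.
    pooled_sxx = sum(s[0] for s in sums)
    pooled_sxy = sum(s[1] for s in sums)
    pooled_syy = sum(s[2] for s in sums)
    sse_reduced = pooled_syy - pooled_sxy ** 2 / pooled_sxx
    df_num, df_den = k - 1, n_total - 2 * k
    f_stat = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_stat, df_num, df_den

# Made-up pre/post scores, for illustration only (not data from the study).
treated = ([10, 20, 30, 40, 50], [24, 29, 33, 40, 44])
comparison = ([10, 20, 30, 40, 50], [15, 24, 31, 41, 49])
f_stat, df_num, df_den = homogeneity_of_slopes_f([treated, comparison])
print(f"F({df_num}, {df_den}) = {f_stat:.2f}")
```

A large F ratio means the lines are not parallel, which is the ATI evidence described above; an F ratio near zero means a single slope describes both groups.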

What theoretical foundation exists for the expectation that a
compensatory education program such as Chapter I will show
detectable effects on student achievement scores? There is the work by
John Carroll from two decades ago (Carroll, 1963) as well as recent
empirical evidence from National Institute of Education labs and
centers (Denham and Lieberman, 1980) that indicates that the amount of
student engagement with academic content (or "academic learning
time") can be a significant predictor of how much the students
learn. Therefore, if the time spent in Chapter I instruction is
"extra" time spent on remediation of weaknesses in basic skills,
one is directly led to the prediction that program students'
performance in basic skills should be greater than what it would be
without the Chapter I instruction. Recent research in this
district (Crawford, Patrick, and Kimball, 1984) has also shown
that positive (though weak) relationships do exist between
the amount of time students spent in Chapter I labs and their
subsequent achievement gains. However, neither raw NCE gains
analyses nor relationships between allocated and engaged time and
achievement gains have addressed the question: How do Chapter I
students' achievement gains compare with a local, "matched" sample
of similar, low achieving, low SES, non-Chapter I treated students?
The purpose of this study was to determine whether the Chapter I
instruction provided to some of the low achieving students in the
district appears to result in a detectable effect on student
academic growth. The Chapter I students were compared with a similar
("matched") group of non-Chapter I students. The non-Chapter I
students are also low achieving, but do not receive Chapter I
services because they either go to school at sites which did not
qualify as Chapter I schools, or were not served by Chapter I
within their school. Chapter I sites are designated by the percentage
of students eligible for free or reduced-payment lunches, and
qualification for those lunches is dependent on family income.
Therefore, Chapter I and non-Chapter I sites differ in the average
SES (socioeconomic status) of the attending students. Even if
Chapter I and non-Chapter I comparison groups were selected that
match on initial (pre-treatment) achievement, they would very likely
still differ on SES. For this reason, as described in the "Method"
section that follows, students in the treatment and comparison
groups were matched on both pre-achievement measures and SES as
measured by eligibility for free or reduced-payment lunches.

Method 

Subjects. All students entering into these analyses (spanning
two school years, 1981-82 and 1982-83) were enrolled in the
district for the entire school year. For the analyses of the
'81-'82 year, each student had achievement scores from the
district-wide administration of the California Achievement Test
(CAT) in May 1981 and May 1982. For the analyses of the '82-'83
year, each student had May 1982 and May 1983 CAT scores. All were
in grades 2-8. Both years' groups consisted of both Chapter I
students and their matched or "yoked" non-Chapter I counterparts.

Matching Procedure. As mentioned previously, to obtain a more
precise measure of the Chapter I treatment effect, the students
participating in Chapter I (district-wide) were "matched" with
equivalent low-achieving students who were not served by Chapter I.
It should be reemphasized that Chapter I school sites are selected
based on the percentage of students eligible for free or
reduced-payment lunches. Within the site, participation in Chapter I is
based strictly on achievement scores. Therefore, in each Chapter I
site there are low-achieving students who are eligible for free or
reduced-payment lunches as well as those whose family income is
above the cutoff. This matching process was therefore designed
to produce Chapter I and non-Chapter I comparison groups that were
perfectly matched on prescores (means and variances) and SES. The
Chapter I population was utilized as the "reference" group, and
prescores and free lunch eligibility were compared with individuals
in the non-Chapter I group, student-by-student (within each grade),
to build a sample of "yoked" Chapter I and non-Chapter I groups
that have precisely the same N's, the same means, and the
same variances on the prescore measures (and the same N of students
receiving free and reduced-payment lunches). This matching process
was accomplished separately for reading and math achievement
scores, and resulted in comparison groups (Chapter I and
non-Chapter I together) of 3,556 (Reading 1982); 3,150 (Math 1982);
3,560 (Reading 1983); and 3,134 (Math 1983).
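
A student-by-student yoking of this kind might be sketched as follows. This is an illustration under stated assumptions, not the district's actual procedure: it assumes that a match means exact agreement on grade, prescore, and lunch eligibility, that each comparison student is used at most once, and that Chapter I students without a match are dropped. The record fields (`id`, `grade`, `pre`, `lunch`) and the function name `yoke` are hypothetical.

```python
from collections import defaultdict

def yoke(chapter1, pool):
    """Pair each Chapter I student with one unused comparison student.

    Students are dicts with hypothetical keys 'id', 'grade', 'pre' (prescore)
    and 'lunch' (True if eligible for free or reduced-payment lunch).
    A pair must agree exactly on grade, prescore, and lunch status.
    """
    available = defaultdict(list)
    for student in pool:
        key = (student["grade"], student["pre"], student["lunch"])
        available[key].append(student)
    pairs = []
    for student in chapter1:
        key = (student["grade"], student["pre"], student["lunch"])
        if available[key]:
            # Remove the comparison student so no one is matched twice.
            pairs.append((student, available[key].pop()))
    return pairs  # Chapter I students with no match are left out

# Made-up records, for illustration only.
chapter1 = [
    {"id": "c1", "grade": 3, "pre": 28, "lunch": True},
    {"id": "c2", "grade": 3, "pre": 35, "lunch": False},
    {"id": "c3", "grade": 4, "pre": 28, "lunch": True},
]
pool = [
    {"id": "n1", "grade": 3, "pre": 28, "lunch": True},
    {"id": "n2", "grade": 3, "pre": 35, "lunch": True},  # lunch status differs from c2
    {"id": "n3", "grade": 4, "pre": 28, "lunch": True},
]
pairs = yoke(chapter1, pool)
print([(a["id"], b["id"]) for a, b in pairs])
```

Because every pair shares the same prescore and lunch status, the two yoked groups necessarily have the same N, the same prescore mean and variance, and the same number of lunch-eligible students.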

Procedure. The Chapter I "treatment" was provided by the
district's Chapter I staff during the 1981-82 and 1982-83 academic
years. Elementary students were pulled from their regular
classroom for 30-50 minutes per day of reading instruction and/or
30 minutes per day for math. In the elementary grades, the
activity that the students are pulled from to go to the Chapter I lab
is typically not reading or math activity. Therefore, the time
spent in Chapter I labs may be considered "extra" or supplementary
time in instruction in the basic skills. In middle school grades,
the situation varies somewhat between schools, although students
were usually pulled from language arts classes for reading and from
math classes for Chapter I math instruction. Even so, the time
spent in Chapter I is intended to be more focused on remediation in
basic skills than time spent in the regular classroom.

The nature of the reading and math instruction in Chapter I is
also qualitatively different from regular classroom instruction.
The number of students with each Chapter I teacher is limited to 10
in reading, and 12 in math. Paraprofessional aides are also
employed in each Chapter I Learning Center, and the goal is to
attend to students' individual needs through one-to-one
interactions and discussion between the teacher and individual students.
Recent data (Crawford, 1983) collected in a "process" evaluation
(using objective classroom observation data) indicated that time
spent in Learning Centers is largely academically-oriented and that
there are relatively high rates of private one-to-one interactions
concerning basic skills content.

Instruments. The achievement data analyzed as pre- and post-scores
come from the California Achievement Test (CAT) published by
CTB/McGraw-Hill, Monterey, California. The prescores were derived
from either the May 1981 or May 1982 district-wide testing.
The posttest (dependent variable) data came from the May 1982 or
May 1983 testing with the CAT. The analyses were carried out for
the 1981-82 and for the 1982-83 academic years (for 1981-82,
pre-data were May 1981 scores and for 1982-83, pre-data were May 1982
scores). The scores that were used for analysis purposes were
total math and total reading normal curve equivalent (NCE) scores.
NCE scores are conceptually similar to national percentiles, but
have the advantage of being equal-interval. Spring-to-spring data
were examined since the non-Chapter I students were not tested in
the fall. In the figures to follow, NCE scores were converted to
national percentiles for convenience of interpretation.
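
The conversion between NCE scores and national percentiles can be sketched in a few lines. The sketch assumes the standard NCE definition (a normal-curve scale with mean 50 and standard deviation 21.06, so that NCEs of 1, 50, and 99 coincide with percentiles 1, 50, and 99); the function names are hypothetical and do not come from the CAT documentation.

```python
from statistics import NormalDist

NCE_MEAN = 50.0
NCE_SD = 21.06  # standard deviation of the NCE scale

def nce_to_percentile(nce):
    """National percentile rank corresponding to an NCE score."""
    return 100 * NormalDist().cdf((nce - NCE_MEAN) / NCE_SD)

def percentile_to_nce(percentile):
    """NCE score corresponding to a national percentile rank (0 < percentile < 100)."""
    return NCE_MEAN + NCE_SD * NormalDist().inv_cdf(percentile / 100)

for pct in (1, 20, 40, 50, 99):
    print(pct, round(percentile_to_nce(pct), 1))
```

Equal NCE differences correspond to unequal percentile differences away from the median, which is why the regressions are run on NCEs and only the final results are reported as percentiles.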
Results 



In this study, the homogeneity of slopes test revealed a similar
pattern of ATI for both years (1981-82 and 1982-83) and both
subject areas (reading and math). The result was consistent across
nearly all grades, as 23 out of 28 within-grade comparisons showed
slope differences in the same direction as the overall result.
Figure 1 shows the overall result (ignoring grade) for math and
reading in 1981-82, and Figure 2 presents the results for the
1982-83 analysis.

FIGURE 1

FIGURE 2

As Figures 1 and 2 show, the students who benefitted the most
from Chapter I "treatment" were those who were initially the lowest
achievers. The pattern of ATIs generally replicated across years
and across grades. In 23 of the 28 within-grade comparisons the
Chapter I students had a lower slope and higher intercept than
the non-Chapter I students. Table 1 gives the within-grade
comparisons of slopes for both years and both subject matters.

TABLE 1

Slope for Chapter I (CI) and Non-Chapter I (NCI)
Groups by Grade and by Year

              MATH                              READING
              1981-82         1982-83           1981-82         1982-83
Grade         NCI     CI      NCI     CI        NCI     CI      NCI     CI

2             .66     .64     .44     .29       .58     .50     .42     .33
3             .64     .53     .66     .49       .59     .57     .70     .57
4             .74     .68     .86     .73       .92     .79     .81     .78
5             .88     .78     .73     .73*      .87     .62     .68     .62
6             .76     .59     .85     .61       .62     .59     .68     .61
7             .69     .71*    .57     .44       .51     .63*    .60     .67*
8             .78     .43     .50     .69*      .73     .60     .42     .37

Overall
  Slope       .74     .64     .69     .59       .69     .61     .63     .58
  Intercept 11.15   15.02   12.12   16.45     11.88   14.10   13.32   14.89

*Indicates grades with slope differences in the opposite direction from
the overall slope direction.



As indicated, there was a great deal of consistency across
the grades in the direction of slope differences between the
treatment (CI) groups and the comparison groups (NCI), with the
exception of grade 7. Why grade 7 differs in 3 out of the 4
comparisons is not known. Nevertheless, there is no a priori reason to
expect this degree of consistency (in 23 of 28 comparisons) in a
complex student population such as this, spanning so many grades,
and for two academic years.
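
The tally of 23 out of 28 can be checked directly from Table 1. The sketch below transcribes the within-grade slopes for grades 2 to 8 and counts the comparisons in which the Chapter I slope is strictly lower than the non-Chapter I slope; the variable names are hypothetical, and the grade 5 math comparison for 1982-83, which is a tie at two decimals, is counted as an exception, as the table's asterisk indicates.

```python
# (NCI slope, CI slope) for grades 2-8, transcribed from Table 1.
TABLE_1 = {
    "Math 1981-82": [(.66, .64), (.64, .53), (.74, .68), (.88, .78),
                     (.76, .59), (.69, .71), (.78, .43)],
    "Math 1982-83": [(.44, .29), (.66, .49), (.86, .73), (.73, .73),
                     (.85, .61), (.57, .44), (.50, .69)],
    "Reading 1981-82": [(.58, .50), (.59, .57), (.92, .79), (.87, .62),
                        (.62, .59), (.51, .63), (.73, .60)],
    "Reading 1982-83": [(.42, .33), (.70, .57), (.81, .78), (.68, .62),
                        (.68, .61), (.60, .67), (.42, .37)],
}

def count_lower_ci_slopes(pairs):
    """Number of grades in which the Chapter I slope is below the comparison slope."""
    return sum(1 for nci, ci in pairs if ci < nci)

per_analysis = {name: count_lower_ci_slopes(pairs) for name, pairs in TABLE_1.items()}
total = sum(per_analysis.values())
n_comparisons = sum(len(pairs) for pairs in TABLE_1.values())
print(per_analysis)
print(f"{total} of {n_comparisons} comparisons follow the overall pattern")
```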

We have not presented tests of statistical significance of the
slope differences using the F-distribution, nor have we employed
"regions of significance" methods (Johnson and Neyman, 1936) to
analyze the intersection of regression lines, since it may be argued
that these data are basically "population" data. The students
entering into the analyses are all of the Chapter I students for
whom a match could be found (plus their matching non-Chapter I
counterparts). Because the reader may wonder about the size of the
reported slope differences in comparison to the standard errors of
the slopes, the last figures of results were prepared (see
Figures 3 and 4). For the overall analyses, we created confidence
intervals around each observed slope: one at the ± 1 SE level and
one at the ± 2 SE level. In three of the four comparisons, the
Chapter I and non-Chapter I intervals did not overlap at ± 1 SE
around the observed slopes, but did (barely) overlap when the ± 2
SE confidence interval was used. Only reading scores for '82-'83
showed overlap with ± 1 SE. These findings tend to support the
view that the between-group slopes are different.
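
The overlap pattern described above follows from the reported overall slopes and standard errors. The sketch below is only a check on that arithmetic: it builds the slope ± k SE bands for each comparison and tests whether they touch. The dictionary and function names are hypothetical, and the same standard error is used for both groups in a comparison, as reported in the figures.

```python
# (CI slope, NCI slope, SE of slope) for the overall analyses, as reported.
REPORTED = {
    "Math 1981-82": (.64, .74, .029),
    "Reading 1981-82": (.61, .69, .025),
    "Math 1982-83": (.59, .69, .03),
    "Reading 1982-83": (.58, .63, .027),
}

def bands_overlap(ci_slope, nci_slope, se, k):
    """True if the slope +/- k*SE bands of the two groups overlap."""
    return abs(nci_slope - ci_slope) <= 2 * k * se

overlap_1se = [name for name, (ci, nci, se) in REPORTED.items()
               if bands_overlap(ci, nci, se, k=1)]
overlap_2se = [name for name, (ci, nci, se) in REPORTED.items()
               if bands_overlap(ci, nci, se, k=2)]
print("overlap at 1 SE:", overlap_1se)
print("overlap at 2 SE:", overlap_2se)
```

Only the 1982-83 reading comparison overlaps at one standard error, while all four overlap at two, which matches the description in the text.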

Figure 3

                 1981-82 MATH               1981-82 READING
                 CI Group     NCI Group     CI Group     NCI Group
(raw) Slope      .64          .74           .61          .69
Intercept        15.02        11.15         14.10        11.88
SE of slope      .029         .029          .025         .025
± 1 SE           .611, .669   .711, .769    .585, .635   .665, .715
± 2 SE           .582, .698   .682, .798    .560, .660   .640, .740

Figure 4

                 1982-83 MATH               1982-83 READING
                 CI Group     NCI Group     CI Group     NCI Group
(raw) Slope      .59          .69           .58          .63
Intercept        16.45        12.12         14.89        13.32
SE of slope      .03          .03           .027         .027
± 1 SE           .56, .62     .66, .72      .553, .607   .603, .657
± 2 SE           .53, .65     .63, .75      .526, .634   .576, .684

Discussion. From a scientific perspective, the importance of
these results is that they indicate that the "treatment" can affect
students differentially within a compensatory program -- in this
case, initially lower achievers apparently benefit more from
treatment than the initially higher achievers. Accordingly, a simple
between-group t-test, or even an ANCOVA applied naively, would not
have detected these differences. Obviously, the "homogeneity of
slopes" test is crucial, and has to be attended to prior to
utilizing ANCOVA techniques. This preliminary test is more than a mere
statistical prelude to ANCOVA; it addresses important substantive
questions as well.

The study also had major implications for policy decision-makers.
In these times of dwindling incoming resources and
increased program costs, the point of intersection of the treatment
slopes provided administrators with relevant data to modify the
selection rule for the program, so that only those students who
were most likely to benefit from the program would be served. This
district had historically used the 40th percentile (nationally) as
the "selection rule" or cutoff in both reading and math across the
district. Within each designated Chapter I site, any student
scoring lower than the 40th percentile was eligible for inclusion
into the program. This often resulted in long waiting lists at
some schools, because the number of students that were eligible
exceeded the maximum allowable as defined by the program
regulations. With lower pre-score cutoffs for eligibility, the program
comes closer to serving all those who qualify.

In math, in the 1981-82 data, the point of intersection
indicated that the students below the 30th percentile (nationally)
showed the most achievement gain due to Chapter I participation.
In the 1982-83 math data, the students below the 39th percentile*
appeared to benefit the most from the program. The recommendation
for modifying the selection rule for math essentially involved
"splitting the difference" between the 1981-82 and 1982-83 results;
hence, it was recommended that the 35th percentile be adopted as
the new cutoff.

*Even though policy stated that the 40th percentile was the cutoff,
some students were served with prescores up to the 50th percentile,
so there were students in the analyses beyond the 39th percentile.

In reading, in the 1981-82 data, the results showed that
students initially scoring below the 15th percentile were the ones
most benefitting from the Chapter I services. For reading, in the
results for 1982-83, the percentile cutoff below which the Chapter I
regression line was higher than the non-Chapter I line was the
18th. Administrators chose a cutoff for inclusion in Chapter I
reading at a "convenient" value (the 20th percentile) just above
the cutoff points indicated by the data. The reasons for choosing a
slightly higher value than suggested by the results for reading
were to have an easy-to-implement eligibility rule, and that
reading is recognized as a priority skill, so decision-makers
would rather err in the direction of including a few too many
students than perhaps exclude some students who may benefit from the
treatment.
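
The cutoffs discussed above can be approximated from the overall slopes and intercepts in Table 1. The sketch below finds the pre-score at which the Chapter I and non-Chapter I regression lines cross and converts that NCE value to a national percentile, assuming the standard NCE scale (mean 50, standard deviation 21.06). Because the published slopes are rounded to two decimals, the output should only be expected to land near the reported 30th, 39th, 15th, and 18th percentiles, not to reproduce them exactly. All names in the code are hypothetical.

```python
from statistics import NormalDist

# (intercept, slope) of the overall post-on-pre lines on the NCE scale, from Table 1.
LINES = {
    "Math 1981-82": {"CI": (15.02, .64), "NCI": (11.15, .74)},
    "Math 1982-83": {"CI": (16.45, .59), "NCI": (12.12, .69)},
    "Reading 1981-82": {"CI": (14.10, .61), "NCI": (11.88, .69)},
    "Reading 1982-83": {"CI": (14.89, .58), "NCI": (13.32, .63)},
}

def crossover_nce(ci_line, nci_line):
    """Pre-score (NCE) at which the two lines predict the same post-score."""
    (a_ci, b_ci), (a_nci, b_nci) = ci_line, nci_line
    # Solve a_ci + b_ci * x = a_nci + b_nci * x for x.
    return (a_ci - a_nci) / (b_nci - b_ci)

def nce_to_percentile(nce):
    """National percentile for an NCE, assuming mean 50 and SD 21.06."""
    return 100 * NormalDist().cdf((nce - 50) / 21.06)

crossovers = {}
for name, lines in LINES.items():
    nce = crossover_nce(lines["CI"], lines["NCI"])
    crossovers[name] = (nce, nce_to_percentile(nce))
    print(f"{name}: lines cross near NCE {nce:.1f}, "
          f"about percentile {crossovers[name][1]:.0f}")
```

Below the crossover the Chapter I line lies above the comparison line, so the predicted post-score is higher for served students; above it the ordering reverses.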

What psychological or pedagogical phenomenon could account for
the observed results? Recent information-processing theory
(Sternberg, 1984) has turned attention to "knowledge-acquisition"
components (as distinct from "performance" components and other,
higher-level "metacomponents"). Our findings may be due to the
fact that different learner types may engage their
"knowledge-acquisition" components quite differently. For
illustrative purposes, consider an oversimplified example.

Learner type #1 is at the 7th percentile overall in reading.
This student has serious problems with most sub-areas included as
part of Total Reading. However, learner type #2 is at the 35th
percentile nationally, and is basically an "average" student in all
reading subtests except for one, perhaps phonics, which has a
particularly low score (that brought the overall reading score down to
35). The type of treatment "needed" by these two types of students
might be quite different.

Student type #1 needs individual attention. In order to
maximize his or her acquisition of knowledge about reading, a highly
structured environment works best. The small class size and
presence of 2 adults with only 8-10 students suit this student's needs
quite well. He or she will gain the most if placed in a Chapter I
Learning Center.

Student type #2 scores around the 50th percentile in all
reading sub-areas except for phonics. This student is basically
functioning normally in the regular classroom. He or she mainly
needs some extra help in one area: phonics. The gains of this
student may be maximized by leaving him or her in the regular
classroom. For this student, the benefits of the small class size
and individual attention in Chapter I do not outweigh the costs
incurred by losing the continuity of regular classroom instruction.

Such an explanation would imply that certain educational
effects sometimes "compete" (and, further, that the nature and
outcome of that competition varies from one type of learner to
another). Student type #1 responds so well to the high structure
of the Chapter I Learning Center that the loss of continuity in
regular classroom instruction is more than overcome. However, the
type #2 student does not benefit enough from that (Learning Center)
environment to fully compensate for the loss of what would have
been gained from staying in the regular classroom.

Specifying this "cost benefit" conception of Chapter I effects
in combination with regular classroom effects requires future
research.